Ergodic Theorems for Lower Probabilities

S. Cerreia-Vioglio, F. Maccheroni, and M. Marinacci

Abstract. We establish an Ergodic Theorem for lower probabilities, a generalization of standard probabilities widely used in applications. As a by-product, we provide a version for lower probabilities of the Strong Law of Large Numbers.


1. Introduction 

The purpose of this paper is to state and prove an Ergodic Theorem for lower probabilities: a class of monotone set functions that are not necessarily additive and are widely used in applications where standard additive probabilities turn out to be inadequate (for applications in Economics see Marinacci and Montrucchio; for applications in Statistics see Walley [22]).

We consider a measurable space $(\Omega,\mathcal{F})$ endowed with an $\mathcal{F}\backslash\mathcal{F}$-measurable transformation $\tau : \Omega \to \Omega$ and a (continuous) lower probability $\nu : \mathcal{F} \to [0,1]$. We study four different notions of invariance for lower probabilities (Definitions 1-4). They are equivalent in the additive case, and so are genuine generalizations to the nonadditive setting of the usual concept of invariance.

The most natural definition of invariance for a lower probability $\nu$ (Definition 1) requires that

\[ \nu(A) = \nu(\tau^{-1}(A)) \qquad \forall A \in \mathcal{F}. \]

It is the weakest form of invariance for the nonadditive case. Nevertheless, it is still possible to derive a version of the Ergodic Theorem (Theorem 2). In other words, if $\nu$ is an invariant lower probability, then for each real valued, bounded, and measurable function $f : \Omega \to \mathbb{R}$ the limit

\[ \lim_{n} \frac{1}{n} \sum_{k=1}^{n} f(\tau^{k-1}(\omega)) \]

exists on a set that has measure 1 with respect to $\nu$. If, in addition, $\nu$ is ergodic, we are able to provide bounds for such limit in terms of lower and upper Choquet integrals.

Under the stronger notions of invariance (Definitions 2-4), the previous result can be strengthened in several ways. First, we develop a nonadditive version of Kingman's super-subadditive ergodic theorem (Theorem 3). Second, when $(\Omega,\mathcal{F})$ is a standard measurable space we can better characterize the limit of time averages (Corollary 2).

As an application of our main result, we establish a nonadditive version of the Strong Law of Large Numbers (Theorem 4) for stationary and ergodic processes.

2. Mathematical Preliminaries 

2.1. Set functions. Consider a measurable space $(S,\Sigma)$, where $S$ is a nonempty set and $\Sigma$ is a $\sigma$-algebra of subsets of $S$. Subsets of $S$ are understood to be in $\Sigma$ even when not stated explicitly. A set function $\nu : \Sigma \to [0,1]$ is

(i) a capacity if $\nu(\emptyset) = 0$, $\nu(S) = 1$, and $\nu(A) \leq \nu(B)$ for all $A$ and $B$ such that $A \subseteq B$;

(ii) convex if $\nu(A \cup B) + \nu(A \cap B) \geq \nu(A) + \nu(B)$ for all $A$ and $B$;

(iii) additive if $\nu(A \cup B) = \nu(A) + \nu(B)$ for all disjoint $A$ and $B$;

(iv) continuous if $\lim_{n\to\infty} \nu(A_n) = \nu(A)$ whenever either $A_n \downarrow A$ or $A_n \uparrow A$;

(v) continuous at $S$ if $\lim_{n\to\infty} \nu(A_n) = \nu(S)$ whenever $A_n \uparrow S$;

(vi) a probability if it is an additive capacity;

(vii) a probability measure if it is a probability which is continuous at $S$.

We denote by $\Delta(S,\Sigma)$ the set of all probabilities on $\Sigma$ and by $\Delta^{\sigma}(S,\Sigma)$ the set of all probability measures on $\Sigma$. We endow both sets with the relative topology induced by the weak* topology.^1 Given $\mathcal{M} \subseteq \Delta^{\sigma}(S,\Sigma)$, we assume that $\mathcal{M}$ is endowed with the $\sigma$-algebra $\mathcal{A}_{\mathcal{M}}$ which is the smallest $\sigma$-algebra that makes the evaluations $P \mapsto P(A)$ measurable for all $A \in \Sigma$. A set function $\nu : \Sigma \to [0,1]$ is

(viii) a lower probability (measure) if there exists a compact set $\mathcal{M} \subseteq \Delta^{\sigma}(S,\Sigma)$ such that

\[ \nu(A) = \min_{P \in \mathcal{M}} P(A) \qquad \forall A \in \Sigma. \]


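Given a capacity $\nu$, its conjugate $\bar{\nu} : \Sigma \to [0,1]$ is given by

\[ \bar{\nu}(A) = 1 - \nu(A^{c}) \qquad \forall A \in \Sigma. \]

It is immediate to verify that if $\nu$ is a lower probability, then

\[ \bar{\nu}(A) = \max_{P \in \mathcal{M}} P(A) \qquad \forall A \in \Sigma. \tag{2.1} \]

The core of a capacity $\nu$ is the weak* compact set defined by

\[ \operatorname{core}(\nu) = \{P \in \Delta(S,\Sigma) : P \geq \nu\}, \]

that is, the core is the collection of all probabilities that setwise dominate $\nu$. A capacity $\nu : \Sigma \to [0,1]$ is

(ix) exact if $\operatorname{core}(\nu) \neq \emptyset$ and $\nu(A) = \min_{P \in \operatorname{core}(\nu)} P(A)$ for each $A$.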


^1 Recall that a net $\{P_{\alpha}\}$ converges to $P$, in the weak* topology, if and only if $P_{\alpha}(A) \to P(A)$ for all $A \in \Sigma$. The weak* topology is thus the restriction to $\Delta(S,\Sigma)$ of the topology $\sigma(ba(S,\Sigma), B(S,\Sigma))$ where $B(S,\Sigma)$ is the space of all real valued, bounded, and $\Sigma$-measurable functions on $S$ and $ba(S,\Sigma)$ is the set of all bounded and finitely additive set functions on $\Sigma$. In the case of $S$ being a Polish space and $\Sigma$ the Borel $\sigma$-algebra, the above topology should not be confused with the topology generated by real valued, bounded, and continuous functions on $S$.
If $\nu$ is a convex capacity continuous at $S$, then $\nu$ is exact and $\emptyset \neq \operatorname{core}(\nu) \subseteq \Delta^{\sigma}(S,\Sigma)$ (see [?, Lemma 2 and Theorem 1], [20, Theorem 3.2], and [17, Theorem 4.2 and Theorem 4.7]). In particular, $\nu$ is a lower probability where $\mathcal{M} = \operatorname{core}(\nu)$. Conversely, if $\nu$ is a lower probability, then $\nu$ is exact, continuous at $S$, and $\mathcal{M} \subseteq \operatorname{core}(\nu) \subseteq \Delta^{\sigma}(S,\Sigma)$. Nevertheless, being exact does not automatically imply being convex. An exact capacity continuous at $S$ is continuous. Finally, we say that a statement about a random element holds $\nu$-a.s. if and only if there exists an event $A$ such that $\nu(A) = 1$ and the statement holds for all $s \in A$.
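The definitions of this subsection can be checked by brute force on a finite space. The following is a minimal Python sketch, not part of the paper: the three-point state space `S`, the two probability vectors in `M`, and all function names are illustrative assumptions. It builds a lower probability as the lower envelope of a finite (hence compact) set of probabilities, and verifies the conjugate identity (2.1) and convexity for this particular example.

```python
from fractions import Fraction as Fr
from itertools import chain, combinations

S = (0, 1, 2)  # hypothetical finite state space
# Hypothetical finite set M of probability vectors on S.
M = [
    {0: Fr(1, 2), 1: Fr(1, 4), 2: Fr(1, 4)},
    {0: Fr(1, 4), 1: Fr(1, 4), 2: Fr(1, 2)},
]


def events(space):
    """Return all subsets of a finite space as frozensets."""
    items = list(space)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def prob(P, A):
    """Additive probability of the event A under the vector P."""
    return sum((P[s] for s in A), Fr(0))


def lower(A):
    """Lower probability: minimum of P(A) over M."""
    return min(prob(P, A) for P in M)


def upper(A):
    """Maximum of P(A) over M."""
    return max(prob(P, A) for P in M)


def conjugate(A):
    """Conjugate capacity: 1 minus the lower probability of the complement."""
    return 1 - lower(frozenset(S) - A)


def is_convex():
    """Check the convexity inequality on every pair of events."""
    ev = events(S)
    return all(
        lower(A | B) + lower(A & B) >= lower(A) + lower(B)
        for A in ev
        for B in ev
    )
```

In this example the lower envelope happens to be convex; a lower envelope of probabilities is always exact, but convexity can fail for other choices of `M`.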


2.2. Integrals. We denote by $B(S,\Sigma)$ the set of all bounded and $\Sigma$-measurable functions from $S$ to $\mathbb{R}$. A capacity $\nu$ induces a functional on $B(S,\Sigma)$ via the Choquet integral, defined for all $f \in B(S,\Sigma)$ by:

\[ \int_{S} f\,d\nu = \int_{0}^{\infty} \nu(\{s \in S : f(s) \geq t\})\,dt + \int_{-\infty}^{0} \left[\nu(\{s \in S : f(s) \geq t\}) - \nu(S)\right] dt \]

where the right hand side integrals are (improper) Riemann integrals. If $\nu$ is additive, then the Choquet integral reduces to the standard additive integral. It is also routine to check that $-\int_{S} f\,d\bar{\nu} = \int_{S} -f\,d\nu$ for all $f \in B(S,\Sigma)$. It is well known (see [?, Lemma 2], [21, Proposition 3], and [17, Theorem 4.7]) that if $\nu$ is a convex capacity, then

\[ \int_{S} f\,d\nu = \min_{P \in \operatorname{core}(\nu)} \int_{S} f\,dP \quad \text{and} \quad \int_{S} f\,d\bar{\nu} = \max_{P \in \operatorname{core}(\nu)} \int_{S} f\,dP \qquad \forall f \in B(S,\Sigma). \]
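On a finite space the Choquet integral reduces to a finite sum over the level sets of $f$. The sketch below is an illustration under stated assumptions rather than the paper's construction: the state space, the set `M`, the function `f`, and the helper names are all hypothetical. It evaluates the integral with respect to a lower envelope and its conjugate, and checks the conjugacy identity $-\int f\,d\bar{\nu} = \int -f\,d\nu$.

```python
from fractions import Fraction as Fr

S = (0, 1, 2)  # hypothetical finite state space
# Hypothetical finite set of probability vectors generating the capacity.
M = [
    {0: Fr(1, 2), 1: Fr(1, 4), 2: Fr(1, 4)},
    {0: Fr(1, 4), 1: Fr(1, 4), 2: Fr(1, 2)},
]


def lower(A):
    """Lower envelope of M evaluated at the event A."""
    return min(sum((P[s] for s in A), Fr(0)) for P in M)


def upper(A):
    """Upper envelope of M evaluated at the event A."""
    return max(sum((P[s] for s in A), Fr(0)) for P in M)


def choquet(f, capacity):
    """Choquet integral of f (dict: state -> value) on a finite space.

    With values x_1 < ... < x_m, and a capacity c with c(S) = 1, this is
    x_1 + sum_{i >= 2} (x_i - x_{i-1}) * c({f >= x_i}).
    """
    values = sorted(set(f.values()))
    total = values[0] * capacity(frozenset(f))
    for prev, cur in zip(values, values[1:]):
        level_set = frozenset(s for s in f if f[s] >= cur)
        total += (cur - prev) * capacity(level_set)
    return total


def expectation(f, P):
    """Standard additive integral of f under P."""
    return sum((f[s] * P[s] for s in S), Fr(0))


f = {0: Fr(3), 1: Fr(1), 2: Fr(2)}  # hypothetical bounded function
neg_f = {s: -x for s, x in f.items()}
```

For this choice of `M` and `f`, `choquet(f, lower)` coincides with the smallest expectation over `M` and `choquet(f, upper)` with the largest. In general the min and max in the text range over the whole core, so the lower Choquet integral is only bounded above by the minimum over a generating set.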


In the rest of the paper, we consider three measurable spaces $(S,\Sigma)$. The first one is $(\Omega,\mathcal{F})$ which we interpret as the space where ultimately uncertainty lives. Given a set $\mathcal{P} \subseteq \Delta^{\sigma}(\Omega,\mathcal{F})$, the second space will be $(\mathcal{P},\mathcal{A}_{\mathcal{P}})$ which we interpret as the space of all possible probability models equipped with the $\sigma$-algebra $\mathcal{A}_{\mathcal{P}}$ discussed above. Finally, given a real valued and $\mathcal{F}$-measurable stochastic process $\{f_n\}_{n \in \mathbb{N}}$, we consider the space $(\mathbb{R}^{\mathbb{N}},\sigma(\mathcal{C}))$, which we will interpret as the space of observations endowed with the $\sigma$-algebra generated by the algebra of cylinders $\mathcal{C}$.


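2.3. Prior and Predictive Capacities. Given a set $\mathcal{P} \subseteq \Delta^{\sigma}(\Omega,\mathcal{F})$, a prior is a capacity $\rho : \mathcal{A}_{\mathcal{P}} \to [0,1]$. The associated predictive is the capacity $\nu_{\rho} : \mathcal{F} \to [0,1]$ defined by

\[ \nu_{\rho}(A) = \int_{\mathcal{P}} P(A)\,d\rho(P) \qquad \forall A \in \mathcal{F}. \]

If $\rho$ is additive and continuous at $\mathcal{P}$, then $\rho$ is a prior and $\nu_{\rho}$ is a predictive in the traditional sense. We denote capacities that are additive and continuous at $\mathcal{P}$ by $\pi$. Given a set $\mathcal{P}$, we denote the set of strong extreme points of $\mathcal{P}$ by $\mathcal{S}(\mathcal{P})$.^2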


3. Ergodic Theorems 

3.1. Invariant Capacities. In this section, we consider a measurable space $(\Omega,\mathcal{F})$. We also consider a transformation $\tau : \Omega \to \Omega$ which is $\mathcal{F}\backslash\mathcal{F}$-measurable. Recall that a probability measure $P$ is ($\tau$-)invariant if and only if

\[ P(A) = P(\tau^{-1}(A)) \qquad \forall A \in \mathcal{F}. \tag{3.1} \]

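^2 Recall that $P \in \mathcal{P}$ is a strong extreme point of $\mathcal{P}$ if and only if the Dirac at $P$ (i.e., $\delta_{P}$) is the only probability measure $\pi : \mathcal{A}_{\mathcal{P}} \to [0,1]$ such that $P(A) = \int_{\mathcal{P}} Q(A)\,d\pi(Q)$ for each $A \in \mathcal{F}$.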
We denote by $\mathcal{I}$ the set of all probability measures that satisfy (3.1) and by $\mathcal{G}$ the set of all invariant events of $\mathcal{F}$, that is, $A \in \mathcal{G}$ if and only if $A \in \mathcal{F}$ and $\tau^{-1}(A) = A$. An invariant probability measure $P$ is said to be ergodic if and only if $P(\mathcal{G}) = \{0,1\}$. Similarly, we say that a capacity $\nu$ is ergodic if and only if $\nu(\mathcal{G}) = \{0,1\}$. We denote by $\mathcal{S}(\mathcal{I})$ the subset of $\mathcal{I}$ such that

\[ \mathcal{S}(\mathcal{I}) = \{P \in \mathcal{I} : P(\mathcal{G}) = \{0,1\}\}. \]

If $(\Omega,\mathcal{F})$ is a standard measurable space, then it can be checked that $\mathcal{S}(\mathcal{I})$ is the set of strong extreme points of $\mathcal{I}$ (see Dynkin [12]). Finally, following Dunford and Schwartz [11, pp. 723-724] (see also Dowker [9]), we say that a probability measure $P$ is potentially ($\tau$-)invariant if and only if there exists a probability measure $\hat{P} \in \mathcal{I}$ such that

\[ P(E) = \hat{P}(E) \qquad \forall E \in \mathcal{G}. \]

We denote the set of potentially invariant probability measures by $\mathcal{PI}$.

Next, we propose four notions of ($\tau$-)invariance for a capacity.

Definition 1. A capacity $\nu$ is invariant if and only if for each $A \in \mathcal{F}$

\[ \nu(A) = \nu(\tau^{-1}(A)). \]

Definition 2. A capacity $\nu$ is strongly invariant if and only if for each $A \in \mathcal{F}$

\[ \nu(A \setminus \tau^{-1}(A)) = \bar{\nu}(\tau^{-1}(A) \setminus A) \quad \text{and} \quad \nu(\tau^{-1}(A) \setminus A) = \bar{\nu}(A \setminus \tau^{-1}(A)). \]

Definition 3. A lower probability $\nu$ is functionally invariant if and only if $\mathcal{M} \subseteq \mathcal{I}$.

The fourth definition also describes a procedure by which invariant capacities can be constructed. Such a procedure is a robust Bayesian procedure (see Berger [2] and Shafer [19]).

Definition 4. A capacity $\nu$ is robustly invariant if and only if $\nu = \nu_{\rho}$ for some convex capacity $\rho : \mathcal{A}_{\mathcal{S}(\mathcal{I})} \to [0,1]$.

It can be shown that if $(\Omega,\mathcal{F})$ is a standard measurable space and $\nu$ is robustly invariant and continuous at $\Omega$, then it is a lower probability. In the next two results, we will clarify the connection between these four notions of invariance.

Proposition 1. Let $(\Omega,\mathcal{F})$ be a standard measurable space and $\nu$ a lower probability. The following statements are true:

(1) If $\nu$ is strongly invariant, then $\nu$ is functionally invariant and $\operatorname{core}(\nu) \subseteq \mathcal{I}$.

(2) If $\nu$ is robustly invariant, then $\nu$ is functionally invariant.

(3) If $\nu$ is functionally invariant and $\mathcal{M} \in \mathcal{A}_{\mathcal{S}(\mathcal{I})}$, then $\nu$ is robustly invariant and ergodic.

(4) If $\nu$ is functionally invariant, then $\nu$ is invariant.
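Point (4), and the ergodicity claim of point (3), can be illustrated on a small finite example. The following Python sketch rests on assumptions that are not in the paper: a four-point space, a transformation `TAU` made of two 2-cycles, and a set `M` containing the two ergodic invariant measures. It checks that the lower envelope of `M` is invariant in the sense of Definition 1 and takes only the values 0 and 1 on invariant events.

```python
from fractions import Fraction as Fr
from itertools import chain, combinations

OMEGA = (0, 1, 2, 3)  # hypothetical finite space
TAU = {0: 1, 1: 0, 2: 3, 3: 2}  # hypothetical transformation: two 2-cycles

# Hypothetical M: the uniform measure on each cycle (both are invariant).
M = [
    {0: Fr(1, 2), 1: Fr(1, 2), 2: Fr(0), 3: Fr(0)},
    {0: Fr(0), 1: Fr(0), 2: Fr(1, 2), 3: Fr(1, 2)},
]


def events():
    """Return all subsets of OMEGA as frozensets."""
    subsets = chain.from_iterable(
        combinations(OMEGA, r) for r in range(len(OMEGA) + 1)
    )
    return [frozenset(s) for s in subsets]


def preimage(A):
    """Inverse image of the event A under TAU."""
    return frozenset(w for w in OMEGA if TAU[w] in A)


def prob(P, A):
    return sum((P[w] for w in A), Fr(0))


def lower(A):
    """Lower probability generated by M."""
    return min(prob(P, A) for P in M)


def is_invariant_measure(P):
    return all(prob(P, A) == prob(P, preimage(A)) for A in events())


def is_invariant_capacity(capacity):
    return all(capacity(A) == capacity(preimage(A)) for A in events())


def invariant_events():
    """Events A with preimage(A) == A."""
    return [A for A in events() if preimage(A) == A]
```

Here the invariant events are the empty set, the two cycles, and the whole space, so the lower envelope is ergodic in the sense of the text even though it is not additive.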

The connection among some of these notions of invariance becomes sharper when $\nu$ is convex.

Theorem 1. Let $(\Omega,\mathcal{F})$ be a standard measurable space and $\nu$ a convex capacity continuous at $\Omega$. The following statements are equivalent:

(i) $\nu$ is strongly invariant;

(ii) $\nu$ is functionally invariant and $\operatorname{core}(\nu) \subseteq \mathcal{I}$;

(iii) $\nu$ is robustly invariant and $\operatorname{core}(\nu) \subseteq \mathcal{I}$;

(iv) $\operatorname{core}(\nu) \subseteq \mathcal{I}$.

As a corollary, we obtain that the four definitions coincide with the usual definition of invariance when $\nu$ is a probability measure. Under additional assumptions on $\Omega$ and $\tau$, in the additive case, the equivalence between points (i) and (iii) follows by an application of the Choquet-Bishop-de Leeuw theorem (see Phelps [18]). In our case, the equivalence between points (i) and (iii) could be proven by developing a nonadditive version of the Choquet-Bishop-de Leeuw theorem. This can be achieved by using the techniques contained in Cerreia-Vioglio, Maccheroni, Marinacci, and Montrucchio [?]. Finally, in the next section, we show that, if $\nu$ is an invariant lower probability, then its core must be contained in $\mathcal{PI}$.

3.2. Ergodic Theorem. Given the notions of invariance previously discussed, we could then ask ourselves if suitable ergodic theorems can be developed for nonadditive probabilities. In light of Proposition 1 and Theorem 1, an immediate dichotomy presents itself. In fact, the notion of invariance of Definition 1 stands separate from, and is actually weaker than, the other notions of strong, robust, and functional invariance, even in the convex case. Theorem 2 only assumes the weak form of invariance of Definition 1. On the other hand, Corollary 2 assumes strong invariance. Strong invariance, paired with the convexity of $\nu$ and $(\Omega,\mathcal{F})$ being standard, allows us to provide a sharper version of Theorem 2.

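Theorem 2. Let $(\Omega,\mathcal{F})$ be a measurable space and $\nu$ a lower probability. If $\nu$ is invariant, then for each $f \in B(\Omega,\mathcal{F})$ there exists $f^{*} \in B(\Omega,\mathcal{G})$ such that

\[ \lim_{n} \frac{1}{n} \sum_{k=1}^{n} f(\tau^{k-1}(\omega)) = f^{*}(\omega) \qquad \nu\text{-a.s.} \]

Moreover, if $\nu$ is ergodic, then

\[ \nu\left(\left\{\omega \in \Omega : \int_{\Omega} f^{*}\,d\nu \leq \lim_{n} \frac{1}{n} \sum_{k=1}^{n} f(\tau^{k-1}(\omega)) \leq \int_{\Omega} f^{*}\,d\bar{\nu}\right\}\right) = 1. \]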

As a corollary, we are able to show a necessary property that $\operatorname{core}(\nu)$ of an invariant lower probability $\nu$ must satisfy (cf. Proposition 1). Clearly, it is not a characterization since it is well known that there are probability measures that are potentially invariant, but not invariant.

Corollary 1. If a lower probability $\nu$ is invariant, then $\operatorname{core}(\nu) \subseteq \mathcal{PI}$.
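To make the ergodic statement concrete, here is a minimal Python sketch on a finite example. Everything in it is an illustrative assumption (the four-point space, the two-cycle map `TAU`, the set `M`, and the function `F`); it is not the paper's proof technique. Time averages along each orbit converge to the cycle mean $f^{*}$, and $f^{*}$ lies between its smallest and largest expectation over `M`, which in this example coincide with the lower and upper Choquet integrals of $f^{*}$.

```python
from fractions import Fraction as Fr

OMEGA = (0, 1, 2, 3)  # hypothetical finite space
TAU = {0: 1, 1: 0, 2: 3, 3: 2}  # hypothetical map with two 2-cycles

# Hypothetical M: the two ergodic invariant measures, one per cycle.
M = [
    {0: Fr(1, 2), 1: Fr(1, 2), 2: Fr(0), 3: Fr(0)},
    {0: Fr(0), 1: Fr(0), 2: Fr(1, 2), 3: Fr(1, 2)},
]

F = {0: Fr(4), 1: Fr(0), 2: Fr(1), 3: Fr(5)}  # hypothetical bounded function


def time_average(f, w, n):
    """Compute (1/n) * sum_{k=1}^{n} f(tau^{k-1}(w))."""
    total = Fr(0)
    for _ in range(n):
        total += f[w]
        w = TAU[w]
    return total / n


def expectation(f, P):
    return sum((f[w] * P[w] for w in OMEGA), Fr(0))


# Limit function f*: constant on each cycle and equal to the cycle mean.
F_STAR = {w: (F[w] + F[TAU[w]]) / 2 for w in OMEGA}

lower_bound = min(expectation(F_STAR, P) for P in M)
upper_bound = max(expectation(F_STAR, P) for P in M)
```

The limit differs across cycles (2 on one, 3 on the other), so it is not constant, yet it stays inside the interval `[lower_bound, upper_bound]` at every point.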

As a second corollary, we discuss the ergodic theorem for convex and strongly invariant capacities. Compared to Theorem 2, the following corollary assumes $\nu$ convex and a stronger form of invariance that, in turn, yield a limit function $f^{*}$ which has more properties. These properties naturally generalize the ones found in the Individual Ergodic Theorem of Birkhoff. In this case, convergence of empirical averages is a simple consequence of Birkhoff's theorem applied to each probability in $\operatorname{core}(\nu)$. Nevertheless, the relation between $f$ and $f^{*}$ in terms of Choquet expectations is not immediate at first sight. A similar comment applies to Theorem 3.

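Corollary 2. Let $(\Omega,\mathcal{F})$ be a standard measurable space and $\nu$ a convex capacity continuous at $\Omega$. If $\nu$ is strongly invariant, then for each $f \in B(\Omega,\mathcal{F})$ there exists $f^{*} \in B(\Omega,\mathcal{G})$ such that

\[ \lim_{n} \frac{1}{n} \sum_{k=1}^{n} f(\tau^{k-1}(\omega)) = f^{*}(\omega) \qquad \nu\text{-a.s.} \tag{3.2} \]

Moreover,

(1) For each $P \in \mathcal{I}$, $f^{*}$ is a version of the conditional expectation of $f$ given $\mathcal{G}$.

(2) $\int_{\Omega} f^{*}\,d\nu = \int_{\Omega} f\,d\nu$.

(3) If $\nu$ is ergodic, then

\[ \nu\left(\left\{\omega \in \Omega : \int_{\Omega} f\,d\nu \leq \lim_{n} \frac{1}{n} \sum_{k=1}^{n} f(\tau^{k-1}(\omega)) \leq \int_{\Omega} f\,d\bar{\nu}\right\}\right) = 1. \]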

3.3. Subadditive Ergodic Theorem. Next we turn to a Subadditive Ergodic Theorem for lower probabilities.

Definition 5. A sequence $\{S_n\}_{n \in \mathbb{N}}$ of $\mathcal{F}$-measurable random variables is superadditive (resp., subadditive) if and only if

\[ S_{n+k} \geq S_n + S_k \circ \tau^{n} \quad (\text{resp., } \leq) \qquad \forall n,k \in \mathbb{N}. \]

The sequence $\{S_n\}_{n \in \mathbb{N}}$ is additive if and only if it is superadditive and subadditive.

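Consider an $\mathcal{F}$-measurable function $f : \Omega \to \mathbb{R}$. If we define $\{S_n\}_{n \in \mathbb{N}}$ by

\[ S_n = \sum_{k=1}^{n} f \circ \tau^{k-1} \qquad \forall n \in \mathbb{N}, \tag{3.3} \]

then we have that $\{S_n\}_{n \in \mathbb{N}}$ is an additive sequence. The opposite is also true, that is, if $\{S_n\}_{n \in \mathbb{N}}$ is additive, then it takes the form (3.3) for some $\mathcal{F}$-measurable real valued function $f$. On the other hand, if we take $\{S_n\}_{n \in \mathbb{N}}$ as in (3.3) and we consider $\{|S_n|\}_{n \in \mathbb{N}}$, then we obtain a genuine subadditive sequence. Note that if $f \in B(\Omega,\mathcal{F})$, then we also have that there exists $\lambda \in \mathbb{R}$ such that, for all $n \in \mathbb{N}$,

\[ -\lambda n \leq S_n(\omega) \leq \lambda n \qquad \forall \omega \in \Omega. \tag{3.4} \]

Similarly, we have that $-\lambda n \leq |S_n| \leq \lambda n$ for all $n \in \mathbb{N}$.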
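The additivity of (3.3) and the subadditivity of its absolute value can be verified numerically. The sketch below is a hedged illustration: the finite space, the map `TAU`, and the function `F` are hypothetical choices, and the checks run only over a finite range of indices.

```python
OMEGA = (0, 1, 2, 3)  # hypothetical finite space
TAU = {0: 1, 1: 0, 2: 3, 3: 2}  # hypothetical transformation
F = {0: 2, 1: -3, 2: 1, 3: 1}  # hypothetical bounded function
LAMBDA = max(abs(x) for x in F.values())  # bound used in (3.4)


def iterate(w, n):
    """Apply TAU n times to the point w."""
    for _ in range(n):
        w = TAU[w]
    return w


def S(n, w):
    """Partial sum S_n(w) = sum_{k=1}^{n} F(tau^{k-1}(w))."""
    return sum(F[iterate(w, k - 1)] for k in range(1, n + 1))


def index_pairs(max_n):
    return [(n, k) for n in range(1, max_n + 1) for k in range(1, max_n + 1)]


def is_additive(max_n):
    """Check S_{n+k} = S_n + S_k o tau^n on a finite range."""
    return all(
        S(n + k, w) == S(n, w) + S(k, iterate(w, n))
        for n, k in index_pairs(max_n)
        for w in OMEGA
    )


def abs_is_subadditive(max_n):
    """Check |S_{n+k}| <= |S_n| + |S_k o tau^n| on a finite range."""
    return all(
        abs(S(n + k, w)) <= abs(S(n, w)) + abs(S(k, iterate(w, n)))
        for n, k in index_pairs(max_n)
        for w in OMEGA
    )


def abs_is_superadditive(max_n):
    """Check the reverse inequality; it fails for this F."""
    return all(
        abs(S(n + k, w)) >= abs(S(n, w)) + abs(S(k, iterate(w, n)))
        for n, k in index_pairs(max_n)
        for w in OMEGA
    )
```

Because `F` changes sign along the first cycle, the absolute partial sums are subadditive but not superadditive, which is the sense in which the sequence is "genuinely" subadditive.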


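Theorem 3. Let $(\Omega,\mathcal{F})$ be a standard measurable space and $\nu$ a lower probability. If $\{S_n\}_{n \in \mathbb{N}}$ is either a superadditive or a subadditive sequence that satisfies (3.4) and if $\nu$ is functionally invariant, then there exists $f^{*} \in B(\Omega,\mathcal{G})$ such that

\[ \lim_{n} \frac{S_n}{n} = f^{*} \qquad \nu\text{-a.s.} \]

Moreover,

(1) If $\nu$ is convex and strongly invariant and $\{S_n\}_{n \in \mathbb{N}}$ is superadditive, then $\int_{\Omega} f^{*}\,d\nu = \sup_{n \in \mathbb{N}} \int_{\Omega} \frac{S_n}{n}\,d\nu$.

(2) If $\nu$ is convex and strongly invariant and $\{S_n\}_{n \in \mathbb{N}}$ is subadditive, then $\int_{\Omega} f^{*}\,d\bar{\nu} = \inf_{n \in \mathbb{N}} \int_{\Omega} \frac{S_n}{n}\,d\bar{\nu}$.

(3) If $\nu$ is ergodic and $\{S_n\}_{n \in \mathbb{N}}$ is either subadditive or superadditive, then

\[ \nu\left(\left\{\omega \in \Omega : \int_{\Omega} f^{*}\,d\nu \leq \lim_{n} \frac{S_n(\omega)}{n} \leq \int_{\Omega} f^{*}\,d\bar{\nu}\right\}\right) = 1. \]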
4. Strong Law of Large Numbers

As an application of Theorem 2, we provide a nonadditive version of the Strong Law of Large Numbers. Before doing so, we need to introduce some notation and terminology. Consider a sequence of real valued, bounded, and measurable random variables $\mathbf{f} = \{f_n\}_{n \in \mathbb{N}} \subseteq B(\Omega,\mathcal{F})$. We denote by $\mathcal{T}$ the tail $\sigma$-algebra

\[ \bigcap_{k \in \mathbb{N}} \sigma(f_k, f_{k+1}, \ldots). \]

Definition 6. Given a capacity $\nu$, we say that $\mathbf{f} = \{f_n\}_{n \in \mathbb{N}}$ is stationary if and only if for each $n \in \mathbb{N}$, for each $k \in \mathbb{N}_0$, and for each Borel subset $B$ of $\mathbb{R}^{k+1}$

\[ \nu(\{\omega \in \Omega : (f_n(\omega), \ldots, f_{n+k}(\omega)) \in B\}) = \nu(\{\omega \in \Omega : (f_{n+1}(\omega), \ldots, f_{n+k+1}(\omega)) \in B\}). \tag{4.1} \]

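This notion generalizes the usual notion of stationary stochastic process by allowing for the nonadditivity of the underlying probability measure. Recall that $(\mathbb{R}^{\mathbb{N}},\sigma(\mathcal{C}))$ denotes the space of sequences endowed with the $\sigma$-algebra generated by the algebra of cylinders. We denote a generic element of $\mathbb{R}^{\mathbb{N}}$ by $x$. We also consider the shift transformation $\tau : \mathbb{R}^{\mathbb{N}} \to \mathbb{R}^{\mathbb{N}}$ defined by

\[ \tau(x) = (x_2, x_3, x_4, \ldots) \qquad \forall x \in \mathbb{R}^{\mathbb{N}}. \]

The sequence $\{f_n\}_{n \in \mathbb{N}}$ induces a natural (measurable) map between $(\Omega,\mathcal{F})$ and $(\mathbb{R}^{\mathbb{N}},\sigma(\mathcal{C}))$, defined by

\[ \omega \mapsto \mathbf{f}(\omega) = (f_1(\omega), \ldots, f_k(\omega), \ldots) \qquad \forall \omega \in \Omega. \]

Define $\nu_{\mathbf{f}} : \sigma(\mathcal{C}) \to [0,1]$ by

\[ \nu_{\mathbf{f}}(C) = \nu(\mathbf{f}^{-1}(C)) \qquad \forall C \in \sigma(\mathcal{C}). \]

Definition 7. Given a capacity $\nu$, we say that $\mathbf{f} = \{f_n\}_{n \in \mathbb{N}}$ is ergodic if and only if $\nu_{\mathbf{f}}$ is ergodic with respect to the shift transformation.

Lemma 1. If $\nu$ is a convex capacity continuous at $\Omega$ and $\mathbf{f}$ is stationary, then $\nu_{\mathbf{f}}$ is a convex capacity continuous at $\mathbb{R}^{\mathbb{N}}$ which is shift invariant. Moreover, $\mathbf{f}$ is ergodic if $\nu(\mathcal{T}) = \{0,1\}$.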

This observation is a first step to deduce the Strong Law of Large Numbers as a corollary of Theorem 2 applied to $\nu_{\mathbf{f}}$. In a nutshell, the assumption of stationarity yields that the limit

\[ \lim_{n} \frac{1}{n} \sum_{k=1}^{n} f_k \]

exists $\nu$-a.s. In order to obtain also a characterization of the limit in terms of the (Choquet) expected value, we further need $\nu_{\mathbf{f}}$ to be ergodic.

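Theorem 4. Let $\nu$ be a convex capacity continuous at $\Omega$. If $\mathbf{f} = \{f_n\}_{n \in \mathbb{N}}$ is stationary and ergodic, then

\[ \nu\left(\left\{\omega \in \Omega : \int_{\Omega} f_1\,d\nu \leq \lim_{n} \frac{1}{n} \sum_{k=1}^{n} f_k(\omega) \leq \int_{\Omega} f_1\,d\bar{\nu}\right\}\right) = 1. \]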


We close by observing that there are a few important differences with the nonadditive Strong Law of Large Numbers of Marinacci [16] and Maccheroni and Marinacci [15]. In terms of hypotheses, we weaken the assumption of total monotonicity of $\nu$ to convexity, while we replace the i.i.d. hypothesis of [16] with stationarity and ergodicity. Finally, compared to the main result of [15], we need to assume the continuity of $\nu$. In turn, we obtain that the limit of empirical averages exists $\nu$-a.s., a property that was not present in previous works. The bounds for these empirical averages are in terms of the lower and the upper Choquet integrals of the random variable $f_1$, as in [16] and [15].


Appendix A. Dynkin Spaces and Nonadditive Probabilities 

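Consider a standard measurable space $(\Omega,\mathcal{F})$ and a transformation $\tau : \Omega \to \Omega$ which is $\mathcal{F}\backslash\mathcal{F}$-measurable. Recall that we denote by $\mathcal{I}$ the set of all invariant probability measures. If $\mathcal{I}$ is a nonempty set, then the triple $(\Omega,\mathcal{F},\mathcal{I})$ forms a Dynkin space.

Definition 8 (Dynkin, 1978). Let $\mathcal{P}$ be a nonempty subset of $\Delta^{\sigma}(\Omega,\mathcal{F})$ where $(\Omega,\mathcal{F})$ is a separable measurable space. The triple $(\Omega,\mathcal{F},\mathcal{P})$ is a Dynkin space if and only if there exist a sub-$\sigma$-algebra $\mathcal{G} \subseteq \mathcal{F}$, a set $W \in \mathcal{F}$, and a function

\[ p : \mathcal{F} \times \Omega \to [0,1], \qquad (A,\omega) \mapsto p(A,\omega) \]

such that:

(a) for each $P \in \mathcal{P}$ and $A \in \mathcal{F}$, $p(A,\cdot) : \Omega \to [0,1]$ is a version of the conditional probability of $A$ given $\mathcal{G}$;

(b) for each $\omega \in \Omega$, $p(\cdot,\omega) : \mathcal{F} \to [0,1]$ is a probability measure;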

(c) $P(W) = 1$ for all $P \in \mathcal{P}$ and $p(\cdot,\omega) \in \mathcal{S}(\mathcal{P})$ for all $\omega \in W$.

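It is not hard to check that, given $f \in B(\Omega,\mathcal{F})$, the function $\hat{f} : \Omega \to \mathbb{R}$, defined by

\[ \hat{f}(\omega) = \int_{\Omega} f\,dp(\cdot,\omega) \qquad \forall \omega \in \Omega, \tag{A.1} \]

is a version of the conditional expected value of $f$ given $\mathcal{G}$ for all $P \in \mathcal{P}$; in particular, $\hat{f} \in B(\Omega,\mathcal{G})$ (see also [6, Remark 13]). When $(\Omega,\mathcal{F})$ is a standard measurable space, if $(\Omega,\mathcal{F},\mathcal{P}) = (\Omega,\mathcal{F},\mathcal{I})$, then $\mathcal{G}$ is the set of invariant events. In particular, we can consider $W = \Omega$ (see Gray [13, Theorem 8.3]). We conclude with an ancillary lemma.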


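Lemma 2. Let $(\Omega,\mathcal{F})$ be a measurable space and $\mathcal{G}$ a sub-$\sigma$-algebra of $\mathcal{F}$. If $\nu$ is a lower probability such that $\nu(\mathcal{G}) = \{0,1\}$ and $g \in B(\Omega,\mathcal{G})$, then

\[ \nu\left(\left\{\omega \in \Omega : \int_{\Omega} g\,d\nu \leq g(\omega) \leq \int_{\Omega} g\,d\bar{\nu}\right\}\right) = 1. \]

Proof. We proceed by assuming that $g \geq 0$. Since $\nu$ is a capacity such that $\nu(\mathcal{G}) = \{0,1\}$ and $0 \leq g \leq \lambda$ for some $\lambda \in \mathbb{R}$, it follows that the sets

\[ I = \{t \in [0,\infty) : \nu(\{\omega \in \Omega : g(\omega) \geq t\}) = 1\} \]

and

\[ J = \{t \in (-\infty,0] : \nu(\{\omega \in \Omega : -g(\omega) \geq t\}) = 1\} \]

are well defined nonempty intervals. $I$ is bounded from above and such that $0 \in I$. $J$ is unbounded from below and such that $-\lambda \in J$. Since $\nu$ is a lower probability, $\nu$ is continuous. We can conclude that $t^{*} = \sup I \in I$ and $t_{*} = \sup J \in J$. Since $\nu(\mathcal{G}) = \{0,1\}$, this implies that

\[ \int_{\Omega} g\,d\nu = \int_{0}^{\infty} \nu(\{\omega \in \Omega : g(\omega) \geq t\})\,dt = \int_{0}^{\sup I} dt = t^{*} \]

and

\[ \int_{\Omega} -g\,d\nu = \int_{-\infty}^{0} \left[\nu(\{\omega \in \Omega : -g(\omega) \geq t\}) - \nu(\Omega)\right] dt = \int_{\sup J}^{0} (-1)\,dt = t_{*}. \]

It follows that $t^{*} = \int_{\Omega} g\,d\nu$ and $t_{*} = \int_{\Omega} -g\,d\nu$, so that $-t_{*} = \int_{\Omega} g\,d\bar{\nu}$. Since $t^{*} \in I$ and $t_{*} \in J$, we also have that

\[ \nu(\{\omega \in \Omega : g(\omega) \geq t^{*}\}) = 1 = \nu(\{\omega \in \Omega : g(\omega) \leq -t_{*}\}). \]

Since $\nu$ is a lower probability, this implies that

\[ \nu\left(\left\{\omega \in \Omega : \int_{\Omega} g\,d\nu \leq g(\omega) \leq \int_{\Omega} g\,d\bar{\nu}\right\}\right) = \nu(\{\omega \in \Omega : t^{*} \leq g(\omega) \leq -t_{*}\}) = 1. \tag{A.2} \]

We next remove the hypothesis that $g \geq 0$. Since $g \in B(\Omega,\mathcal{G})$, it follows that there exists $c \in \mathbb{R}$ such that $g + c1_{\Omega} \geq 0$. By (A.2) and since the Choquet integral is constant additive, it follows that

\[ 1 = \nu\left(\left\{\omega \in \Omega : \int_{\Omega} (g + c1_{\Omega})\,d\nu \leq g(\omega) + c \leq \int_{\Omega} (g + c1_{\Omega})\,d\bar{\nu}\right\}\right) \]
\[ = \nu\left(\left\{\omega \in \Omega : \int_{\Omega} g\,d\nu + c \leq g(\omega) + c \leq \int_{\Omega} g\,d\bar{\nu} + c\right\}\right) \]
\[ = \nu\left(\left\{\omega \in \Omega : \int_{\Omega} g\,d\nu \leq g(\omega) \leq \int_{\Omega} g\,d\bar{\nu}\right\}\right), \]

proving the statement. ■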
Appendix B. Proofs 

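Proof of Proposition 1. Recall that if $\nu$ is a lower probability, we have that

\[ \nu \leq P \leq \bar{\nu} \qquad \forall P \in \operatorname{core}(\nu) \subseteq \Delta^{\sigma}(\Omega,\mathcal{F}). \tag{B.1} \]

1. Pick $A \in \mathcal{F}$. Since $\nu$ is strongly invariant and $\nu \leq \bar{\nu}$, we have $\bar{\nu}(\tau^{-1}(A) \setminus A) = \nu(A \setminus \tau^{-1}(A)) \leq \bar{\nu}(A \setminus \tau^{-1}(A)) = \nu(\tau^{-1}(A) \setminus A) \leq \bar{\nu}(\tau^{-1}(A) \setminus A)$. It follows that $\nu(A \setminus \tau^{-1}(A)) = \bar{\nu}(A \setminus \tau^{-1}(A)) = \nu(\tau^{-1}(A) \setminus A) = \bar{\nu}(\tau^{-1}(A) \setminus A) = k$. By (B.1), we can conclude that $P(A \setminus \tau^{-1}(A)) = k = P(\tau^{-1}(A) \setminus A)$ for all $P \in \operatorname{core}(\nu)$. This implies that $P(A) = P(A \setminus \tau^{-1}(A)) + P(A \cap \tau^{-1}(A)) = P(\tau^{-1}(A) \setminus A) + P(A \cap \tau^{-1}(A)) = P(\tau^{-1}(A))$ for all $P \in \operatorname{core}(\nu)$, proving the statement.

2. By assumption, there exists a convex capacity $\rho : \mathcal{A}_{\mathcal{S}(\mathcal{I})} \to [0,1]$ such that

\[ \nu(A) = \int_{\mathcal{S}(\mathcal{I})} P(A)\,d\rho(P) = \min_{\pi \in \operatorname{core}(\rho)} \int_{\mathcal{S}(\mathcal{I})} P(A)\,d\pi(P) \qquad \forall A \in \mathcal{F}. \tag{B.2} \]

Define $\mathcal{M} = \{\nu_{\pi} : \pi \in \operatorname{core}(\rho)\}$. By [?, Lemma 24] and (B.2) and since $\nu$ is continuous at $\Omega$, we have that $\rho$ is continuous at $\mathcal{S}(\mathcal{I})$, thus, each $\pi$ in $\operatorname{core}(\rho)$ is a probability measure and $\mathcal{M}$ is a compact subset of $\Delta^{\sigma}(\Omega,\mathcal{F})$. Moreover, we also have that $\mathcal{M} \subseteq \mathcal{I}$. We can conclude that $\nu(A) = \min_{\pi \in \operatorname{core}(\rho)} \int_{\mathcal{S}(\mathcal{I})} P(A)\,d\pi(P) = \min_{P \in \mathcal{M}} P(A)$ for all $A \in \mathcal{F}$, proving the statement.

3. Fix $\mathcal{M} \in \mathcal{A}_{\mathcal{S}(\mathcal{I})}$. Consider $\rho : \mathcal{A}_{\mathcal{S}(\mathcal{I})} \to [0,1]$ defined by

\[ \rho(B) = \begin{cases} 1 & \text{if } B \supseteq \mathcal{M} \\ 0 & \text{otherwise} \end{cases} \qquad \forall B \in \mathcal{A}_{\mathcal{S}(\mathcal{I})}. \]

It is immediate to check that $\rho$ is a convex capacity. By [?, Example 4.4] and since $\mathcal{M} \in \mathcal{A}_{\mathcal{S}(\mathcal{I})}$, we have that $\nu(A) = \min_{P \in \mathcal{M}} P(A) = \int_{\mathcal{S}(\mathcal{I})} P(A)\,d\rho(P)$ for all $A \in \mathcal{F}$. Since $\mathcal{M} \subseteq \mathcal{S}(\mathcal{I})$, observe that $P(A) \in \{0,1\}$ for all $P \in \mathcal{M}$ and for all $A \in \mathcal{G}$. It follows that $\nu(\mathcal{G}) = \{0,1\}$.

4. Since $\nu$ is a functionally invariant lower probability, we have that $\mathcal{M} \subseteq \mathcal{I}$ and $\nu(A) = \min_{P \in \mathcal{M}} P(A) = \min_{P \in \mathcal{M}} P(\tau^{-1}(A)) = \nu(\tau^{-1}(A))$ for all $A \in \mathcal{F}$, proving that $\nu$ is invariant. ■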

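Proof of Theorem 1. Recall that if $\nu$ is convex and continuous at $\Omega$, then it is a lower probability.

(i) implies (ii). It follows by point 1 of Proposition 1.

(ii) implies (iii). We just need to show that $\nu$ is robustly invariant. Define $I : B(\Omega,\mathcal{F}) \to \mathbb{R}$ by

\[ I(f) = \int_{\Omega} f\,d\nu \qquad \forall f \in B(\Omega,\mathcal{F}). \]

By Schmeidler [21] (see also [17]), $I$ is comonotonic additive and supermodular. Since $\nu$ is convex, we have that $I(f) = \min_{P \in \operatorname{core}(\nu)} \int_{\Omega} f\,dP$ for all $f \in B(\Omega,\mathcal{F})$. Since $\operatorname{core}(\nu) \subseteq \mathcal{I}$, this implies that if $\int_{\Omega} f_1\,dP \geq \int_{\Omega} f_2\,dP$ for all $P \in \mathcal{I}$, then $I(f_1) \geq I(f_2)$. In particular, $I(f) = I(\hat{f})$ for all $f \in B(\Omega,\mathcal{F})$. It is also immediate to see that $I(k1_{\Omega}) = k$ for all $k \in \mathbb{R}$. It follows that $I$ restricted to $B(\Omega,\mathcal{G})$ is normalized, comonotonic additive, supermodular, and such that $\int_{\Omega} f_1\,dP \geq \int_{\Omega} f_2\,dP$ for all $P \in \mathcal{I}$ implies $I(f_1) \geq I(f_2)$. By [?, Lemma 24 and Proposition 25] and since $(\Omega,\mathcal{F},\mathcal{I})$ is a Dynkin space, it follows that there exists $\hat{I} : B(\mathcal{S}(\mathcal{I}),\mathcal{A}_{\mathcal{S}(\mathcal{I})}) \to \mathbb{R}$ such that $\hat{I}$ is normalized, monotone, comonotonic additive, supermodular, and such that $I(f) = \hat{I}(\langle f,\cdot\rangle)$ for all $f \in B(\Omega,\mathcal{G})$. By [21] (see also [17]), it follows that there exists a convex capacity $\rho : \mathcal{A}_{\mathcal{S}(\mathcal{I})} \to [0,1]$ such that

\[ I(f) = \int_{\mathcal{S}(\mathcal{I})} \left(\int_{\Omega} f\,dP\right) d\rho(P) \qquad \forall f \in B(\Omega,\mathcal{G}). \tag{B.3} \]

Since $I(f) = I(\hat{f})$ for all $f \in B(\Omega,\mathcal{F})$, it follows that (B.3) holds for all $f \in B(\Omega,\mathcal{F})$. In particular, by picking $f = 1_{A}$ with $A \in \mathcal{F}$, this shows that $\nu$ is robustly invariant.

(iii) implies (iv). It is trivial.

(iv) implies (i). Since $\nu$ is convex and $\operatorname{core}(\nu) \subseteq \mathcal{I}$, it follows that

\[ \nu(A \setminus \tau^{-1}(A)) + \nu(A \cup (\tau^{-1}(A))^{c}) = \int_{\Omega} (1_{\Omega} + 1_{A} - 1_{\tau^{-1}(A)})\,d\nu = \min_{P \in \operatorname{core}(\nu)} \int_{\Omega} (1_{\Omega} + 1_{A} - 1_{\tau^{-1}(A)})\,dP = 1. \]

Thus, we have that

\[ \nu(A \setminus \tau^{-1}(A)) = 1 - \nu(A \cup (\tau^{-1}(A))^{c}) = 1 - \nu((\tau^{-1}(A) \setminus A)^{c}) = \bar{\nu}(\tau^{-1}(A) \setminus A). \]

An analogous argument yields that $\nu(\tau^{-1}(A) \setminus A) = \bar{\nu}(A \setminus \tau^{-1}(A))$, proving the statement. ■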

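Before proving Theorem 2, we provide an ancillary key result.

Theorem 5. Let $(\Omega,\mathcal{F})$ be a measurable space, $\nu$ a lower probability, and assume that the family $\mathcal{I}$ of invariant probability measures is not empty. The following statements are equivalent:

(i) There exists $\hat{P} \in \mathcal{I}$ such that for each $E \in \mathcal{F}$

\[ \hat{P}(E) = 1 \implies \lim_{k} \nu(\tau^{-k}(E)) = 1; \]

(ii) There exists $\hat{P} \in \mathcal{I}$ such that for each $E \in \mathcal{G}$

\[ \hat{P}(E) = 1 \implies \nu(E) = 1; \]

(iii) For each $E \in \mathcal{G}$

\[ P(E) = 1 \quad \forall P \in \mathcal{I} \implies \nu(E) = 1; \]

(iv) For each $f \in B(\Omega,\mathcal{F})$ there exists $f^{*} \in B(\Omega,\mathcal{G})$ such that

\[ \lim_{n} \frac{1}{n} \sum_{k=1}^{n} f(\tau^{k-1}(\omega)) = f^{*}(\omega) \qquad \nu\text{-a.s.}; \]

(v) $\operatorname{core}(\nu) \subseteq \mathcal{PI}$.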

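Proof. (i) implies (ii). If $E \in \mathcal{G}$, then $\tau^{-k}(E) = E$ for all $k \in \mathbb{N}$, yielding the statement.

(ii) implies (iii). It is trivial.

(iii) implies (iv). Consider $f \in B(\Omega,\mathcal{F})$. Define $f^{*} : \Omega \to \mathbb{R}$ by

\[ f^{*}(\omega) = \limsup_{n} \frac{1}{n} \sum_{k=1}^{n} f(\tau^{k-1}(\omega)) \qquad \forall \omega \in \Omega. \]

Define $f_{*} : \Omega \to \mathbb{R}$ by considering the liminf. Since $f \in B(\Omega,\mathcal{F})$, it can be shown that $f^{*}, f_{*} \in B(\Omega,\mathcal{G})$. Consider the event

\[ E = \left\{\omega \in \Omega : \lim_{n} \frac{1}{n} \sum_{k=1}^{n} f(\tau^{k-1}(\omega)) \text{ exists}\right\} = \{\omega \in \Omega : f^{*}(\omega) = f_{*}(\omega)\} = \left\{\omega \in \Omega : f^{*}(\omega) = \lim_{n} \frac{1}{n} \sum_{k=1}^{n} f(\tau^{k-1}(\omega)) = f_{*}(\omega)\right\}. \]

By Birkhoff's Ergodic Theorem (see [?, Theorem 24.1]), we have that $P(E) = 1$ for all $P \in \mathcal{I}$. By assumption, this yields that $\nu(E) = 1$. Since $f$ was chosen to be generic, the statement follows.

(iv) implies (v). Recall that for each $P \in \operatorname{core}(\nu)$, $P(A) \geq \nu(A)$ for all $A \in \mathcal{F}$. By assumption, we can conclude that for each $P \in \operatorname{core}(\nu)$, for each $f \in B(\Omega,\mathcal{F})$ there exists $f^{*} \in B(\Omega,\mathcal{G})$ such that

\[ \lim_{n} \frac{1}{n} \sum_{k=1}^{n} f(\tau^{k-1}(\omega)) = f^{*}(\omega) \qquad P\text{-a.s.} \]

By [14, p. 964] (see also [11, Exercises 31 and 32, pp. 723-724]), it follows that $P \in \mathcal{PI}$.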

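(v) implies (i). Since $\nu$ is a lower probability, it is continuous at $\Omega$ and exact. By [17, Theorem 4.2], it follows that there exists a measure $P \in \operatorname{core}(\nu)$ such that for each $A \in \mathcal{F}$, for each $\varepsilon > 0$, there exists $\delta > 0$ such that

\[ P(A) < \delta \implies Q(A) < \varepsilon \qquad \forall Q \in \operatorname{core}(\nu). \tag{B.4} \]

It is immediate to show that $P$ is such that for each $A \in \mathcal{F}$

\[ P(A) = 0 \implies Q(A) = 0 \qquad \forall Q \in \operatorname{core}(\nu). \tag{B.5} \]

Since $P \in \operatorname{core}(\nu) \subseteq \mathcal{PI}$, we have that there exists $\hat{P} \in \mathcal{I}$ such that $\hat{P}(E) = P(E)$ for all $E \in \mathcal{G}$. Consider $E \in \mathcal{F}$. Assume that $\hat{P}(E) = 1$. It follows that $\hat{P}(E^{c}) = 0$. At the same time, define $F_n = \bigcup_{k \geq n} \tau^{-k}(E^{c})$. Note that $F_n \downarrow F \in \mathcal{G}$. Since $\hat{P} \in \mathcal{I}$, it follows that $\hat{P}(F) = \lim_{n} \hat{P}(F_n) \leq \hat{P}(F_1) \leq \sum_{k=1}^{\infty} \hat{P}(\tau^{-k}(E^{c})) = 0$. It follows that $\hat{P}(F) = 0$, that is, $P(F) = 0$. By (B.5), we have that $Q(F) = 0$ for all $Q \in \operatorname{core}(\nu)$, that is, $\bar{\nu}(F) = 0$. Since $\nu$ is a lower probability, $\bar{\nu}$ satisfies Fatou's property, that is, $0 \leq \limsup_{k} \bar{\nu}(A_k) \leq \bar{\nu}(\limsup_{k} A_k)$ for each sequence $\{A_k\}_{k \in \mathbb{N}} \subseteq \mathcal{F}$. This implies that $0 \leq \liminf_{k} \bar{\nu}(\tau^{-k}(E^{c})) \leq \limsup_{k} \bar{\nu}(\tau^{-k}(E^{c})) \leq \bar{\nu}(\limsup_{k} \tau^{-k}(E^{c})) = \bar{\nu}(F) = 0$. We can conclude that $\lim_{k} \nu(\tau^{-k}(E)) = \lim_{k} [1 - \bar{\nu}(\tau^{-k}(E^{c}))] = 1$, proving the statement. ■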

The proof of Theorem 2 uses some of the techniques common in Ergodic Theory (see, e.g., [8, Theorem 7]). Also, note that, given a capacity $\nu$, we have that

\[ \operatorname{core}(\nu) = \{P \in \Delta(\Omega,\mathcal{F}) : \bar{\nu} \geq P \geq \nu\} = \{P \in \Delta(\Omega,\mathcal{F}) : \bar{\nu} \geq P\}. \]

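Proof of Theorem 2. We first prove that, given the assumptions, $\emptyset \neq \operatorname{core}(\nu) \subseteq \mathcal{PI}$. In particular, this shows that $\mathcal{I} \neq \emptyset$.

Claim: Let $\nu$ be a lower probability. If $\nu$ is invariant, then $\operatorname{core}(\nu) \subseteq \mathcal{PI}$. In particular, $\mathcal{I} \neq \emptyset$.

Proof of the Claim. Since $\nu$ is invariant, $\bar{\nu}$ is invariant. Since $\nu$ is a lower probability, $\nu$ is continuous at $\Omega$ and, in particular, $\emptyset \neq \operatorname{core}(\nu) \subseteq \Delta^{\sigma}(\Omega,\mathcal{F})$. Fix a Banach-Mazur limit (see [1, p. 550]) $\phi : l^{\infty} \to \mathbb{R}$, that is, a functional from $l^{\infty}$ to $\mathbb{R}$ such that:

(1) $\phi$ is linear;

(2) $\phi$ is positive;

(3) $\phi(x_1, x_2, \ldots) = \phi(x_2, x_3, \ldots)$ for all $x \in l^{\infty}$;

(4) $\phi(x_1, x_2, \ldots) = \lim_{n} x_n$ for all $x \in c$.

Observe that $\nu(A) \leq P(A) \leq \bar{\nu}(A)$ for all $P \in \operatorname{core}(\nu)$ and all $A \in \mathcal{F}$. Fix $P \in \operatorname{core}(\nu)$, define $P_n : \mathcal{F} \to [0,1]$ by

\[ P_n(A) = \frac{1}{n} \sum_{k=0}^{n-1} P(\tau^{-k}(A)) \qquad \forall A \in \mathcal{F}. \]

Note that $P(\tau^{-k}(A)) \leq \bar{\nu}(\tau^{-k}(A)) = \bar{\nu}(A)$ for all $A \in \mathcal{F}$ and for all $k \in \mathbb{N}_0$. Since $\operatorname{core}(\nu)$ is convex, this implies that $\{P_n\}_{n \in \mathbb{N}} \subseteq \operatorname{core}(\nu)$. For each $A \in \mathcal{F}$, define $x_A = (P_1(A), P_2(A), P_3(A), \ldots)$. Note that $0 \leq x_A \leq 1_{\mathbb{N}}$, thus, $x_A \in l^{\infty}$ for all $A \in \mathcal{F}$. Define $\hat{P} : \mathcal{F} \to [0,1]$ by $\hat{P}(A) = \phi(x_A)$ for all $A \in \mathcal{F}$. Since $\phi$ is positive, note that $\hat{P}$ is a well defined positive set function. Next, consider $A, B \in \mathcal{F}$ such that $A \cap B = \emptyset$. Since $\{P_n\}_{n \in \mathbb{N}} \subseteq \Delta(\Omega,\mathcal{F})$, it follows that $P_n(A \cup B) = P_n(A) + P_n(B)$ for all $n \in \mathbb{N}$. Since $\phi$ is linear, this implies that $\hat{P}(A \cup B) = \phi(x_{A \cup B}) = \phi(x_A + x_B) = \phi(x_A) + \phi(x_B) = \hat{P}(A) + \hat{P}(B)$, proving that $\hat{P}$ is additive. Next, consider $A \in \mathcal{G}$. Since $\tau^{-k}(A) = A$ for all $k \in \mathbb{N}_0$, it follows that $P_n(A) = P(A)$ for all $n \in \mathbb{N}$. Since $\phi$ maps convergent sequences into their limit, we have that $\hat{P}(A) = \phi(x_A) = P(A)$. In particular, this implies that $\hat{P}(\Omega) = 1$ and $\hat{P}(\emptyset) = 0$. Up to now, we have proved that $\hat{P} \in \Delta(\Omega,\mathcal{F})$ and $\hat{P}(A) = P(A)$ for all $A \in \mathcal{G}$. Since $\{P_n\}_{n \in \mathbb{N}} \subseteq \operatorname{core}(\nu)$, we have that $x_A \leq \bar{\nu}(A) 1_{\mathbb{N}}$. Since $\phi$ is linear and positive, it follows that $\hat{P}(A) = \phi(x_A) \leq \phi(\bar{\nu}(A) 1_{\mathbb{N}}) = \bar{\nu}(A)$ for all $A \in \mathcal{F}$, that is, $\hat{P} \in \operatorname{core}(\nu)$. Since $\operatorname{core}(\nu) \subseteq \Delta^{\sigma}(\Omega,\mathcal{F})$, we can conclude that $\hat{P} \in \Delta^{\sigma}(\Omega,\mathcal{F})$. We next show that $\hat{P}$ is invariant. Note that for each $A \in \mathcal{F}$ and for each $n \in \mathbb{N}$

\[ P_n(\tau^{-1}(A)) = \frac{1}{n} \sum_{k=0}^{n-1} P(\tau^{-k-1}(A)) = \frac{n+1}{n} \cdot \frac{1}{n+1} \sum_{k=0}^{n} P(\tau^{-k}(A)) - \frac{1}{n} P(A) = \frac{n+1}{n} P_{n+1}(A) - \frac{1}{n} P(A). \]

Define $y = (P_2(A), P_3(A), \ldots)$. Define $z = x_{\tau^{-1}(A)} - y \in l^{\infty}$. Note that $|z_n| = |P_n(\tau^{-1}(A)) - P_{n+1}(A)| \leq \frac{1}{n} |P_{n+1}(A) - P(A)| \leq \frac{1}{n}$ for all $n \in \mathbb{N}$. It follows that $\lim_{n} z_n = 0$. Since $\phi$ satisfies properties 3, 1, and 4, we have that $|\hat{P}(\tau^{-1}(A)) - \hat{P}(A)| = |\phi(x_{\tau^{-1}(A)}) - \phi(x_A)| = |\phi(x_{\tau^{-1}(A)}) - \phi(y)| = |\phi(z)| = 0$, proving that $\hat{P}$ is invariant. Given the previous part of the proof, $\hat{P} \in \mathcal{I}$ and $P \in \mathcal{PI}$. Since $P$ was arbitrarily chosen in $\operatorname{core}(\nu)$, it follows that $\mathcal{I} \neq \emptyset$ and $\operatorname{core}(\nu) \subseteq \mathcal{PI}$. □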
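The Cesàro averages used in the claim can be explored on a finite example. The sketch below is only an illustration under stated assumptions: the three-point space, the non-invertible map `TAU`, and the non-invariant probability `P` are hypothetical, and no Banach-Mazur limit is computed (on this example the averages simply converge). It checks the telescoping identity relating the average of a preimage at step n to the average at step n + 1, together with the resulting 1/n bound.

```python
from fractions import Fraction as Fr
from itertools import chain, combinations

OMEGA = (0, 1, 2)  # hypothetical finite space
TAU = {0: 1, 1: 2, 2: 1}  # hypothetical map; the point 0 is transient
P = {0: Fr(1, 2), 1: Fr(1, 4), 2: Fr(1, 4)}  # hypothetical, not invariant


def events():
    """Return all subsets of OMEGA as frozensets."""
    subsets = chain.from_iterable(
        combinations(OMEGA, r) for r in range(len(OMEGA) + 1)
    )
    return [frozenset(s) for s in subsets]


def preimage(A, k=1):
    """Inverse image of the event A under the k-th iterate of TAU."""
    for _ in range(k):
        A = frozenset(w for w in OMEGA if TAU[w] in A)
    return A


def prob(A):
    return sum((P[w] for w in A), Fr(0))


def cesaro(n, A):
    """Average of P over the preimages of A under iterates 0, ..., n-1."""
    return sum((prob(preimage(A, k)) for k in range(n)), Fr(0)) / n


def identity_holds(n):
    """Check the telescoping identity for the n-th average of a preimage."""
    return all(
        cesaro(n, preimage(A)) == Fr(n + 1, n) * cesaro(n + 1, A) - prob(A) / n
        for A in events()
    )


def bound_holds(n):
    """Check the 1/n bound on the gap between consecutive averages."""
    return all(
        abs(cesaro(n, preimage(A)) - cesaro(n + 1, A)) <= Fr(1, n)
        for A in events()
    )
```

In this example the averages of the event `{1}` approach 1/2, the mass that the uniform measure on the cycle `{1, 2}` assigns to it, which is the invariant measure one expects the construction to select.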

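By the previous claim and Theorem 5, the main statement follows.

Finally, assume that $\nu$ is further ergodic. By Lemma 2 and since $f^{*} \in B(\Omega,\mathcal{G})$ and $\nu$ is an ergodic lower probability, it follows that

\[ \nu\left(\left\{\omega \in \Omega : \int_{\Omega} f^{*}\,d\nu \leq f^{*}(\omega) \leq \int_{\Omega} f^{*}\,d\bar{\nu}\right\}\right) = 1. \]

Since $\nu\left(\left\{\omega \in \Omega : f^{*}(\omega) = \lim_{n} \frac{1}{n} \sum_{k=1}^{n} f(\tau^{k-1}(\omega))\right\}\right) = 1$ and $\nu$ is a lower probability, this implies that

\[ \nu\left(\left\{\omega \in \Omega : \int_{\Omega} f^{*}\,d\nu \leq \lim_{n} \frac{1}{n} \sum_{k=1}^{n} f(\tau^{k-1}(\omega)) \leq \int_{\Omega} f^{*}\,d\bar{\nu}\right\}\right) = 1, \]

proving the statement. ■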

Proof of Corollary 1. It is the proof of the claim contained in the proof of Theorem 2. ■

We next proceed by proving Theorem 3 and obtaining Corollary 2 as a corollary of this former result. It is also possible to provide a proof of Corollary 2 as a consequence of Theorem 2. By Theorem 2, the extra assumption of $(\Omega,\mathcal{F})$ being standard yields the extra property that $f^{*}$ can be chosen to be the regular conditional expectation of $f$. Convexity and strong invariance imply that $\operatorname{core}(\nu) \subseteq \mathcal{I}$. This yields that $\int_{\Omega} f^{*}\,d\nu = \int_{\Omega} f\,d\nu$ as well as $\int_{\Omega} f^{*}\,d\bar{\nu} = \int_{\Omega} f\,d\bar{\nu}$. This, in turn, yields a sharper result under the assumption of $\nu$ being ergodic.









Lemma 3. Let {5'„}^gpj he a superadditive (resp., subadditive) sequence that 
satisfies and A4 a compact subset of invariant probability measures. If {an}„gi^ 
in K is defined by an = — minpg_A4 Jq SndP (resp., an = maxpgTn SndP) for all 
n S N, then is subadditive, that is, a„+fe < a^, + for all n,k gN. 

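Proof. Since $\{S_n\}_{n\in\mathbb{N}}$ satisfies (3.4), $\{S_n\}_{n\in\mathbb{N}} \subseteq B(\Omega,\mathcal{F})$. We just prove the statement for the superadditive case, the subadditive one being proven similarly. If $\{S_n\}_{n\in\mathbb{N}}$ is superadditive and $\mathcal{M}$ is a compact subset of invariant probability measures, then we have that

$$\begin{aligned}
-a_{n+k} = \min_{P\in\mathcal{M}}\int_\Omega S_{n+k}\,dP
&\ge \min_{P\in\mathcal{M}}\int_\Omega \left(S_n + S_k\circ\tau^{n}\right)dP \\
&\ge \min_{P\in\mathcal{M}}\int_\Omega S_n\,dP + \min_{P\in\mathcal{M}}\int_\Omega S_k\circ\tau^{n}\,dP \\
&= \min_{P\in\mathcal{M}}\int_\Omega S_n\,dP + \min_{P\in\mathcal{M}}\int_\Omega S_k\,dP = -a_n - a_k
\end{aligned}$$

for all $n,k\in\mathbb{N}$, proving the statement. ■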
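To make Lemma 3 concrete, the following sketch checks the subadditivity of $a_n$ numerically on a small finite example, together with the behaviour of $a_n/n$ that the proof of Theorem 3 relies on. Everything in it is an assumption chosen for illustration and not taken from the paper: the six-point space, the permutation `TAU` (two 3-cycles), the function `F`, the correction term $-\sqrt{n}$ that makes the sequence superadditive but not additive, and the finite (hence compact) family `M` of invariant measures.

```python
import math

# Hypothetical finite setting: Omega = {0,...,5}, tau = permutation with two 3-cycles.
OMEGA = range(6)
TAU = [1, 2, 0, 4, 5, 3]
F = [1.0, 4.0, -2.0, 0.5, 3.0, 2.5]   # an arbitrary bounded function f


def S(n):
    """Values of S_n = sum_{j<n} f o tau^j - sqrt(n) at every point of Omega.

    The term -sqrt(n) is superadditive in n, so S_{n+k} >= S_n + S_k o tau^n.
    """
    values = []
    for w in OMEGA:
        total, x = 0.0, w
        for _ in range(n):
            total += F[x]
            x = TAU[x]
        values.append(total - math.sqrt(n))
    return values


def invariant_measure(t):
    """Mixture of the uniform measures on the two cycles; invariant under tau."""
    return [t / 3] * 3 + [(1 - t) / 3] * 3


# A finite, hence compact, family of invariant probability measures.
M = [invariant_measure(t) for t in (0.2, 0.5, 0.9)]


def a(n):
    """a_n = -min_{P in M} of the integral of S_n with respect to P."""
    s = S(n)
    return -min(sum(p * x for p, x in zip(P, s)) for P in M)


# Subadditivity a_{n+k} <= a_n + a_k, checked up to a floating-point tolerance.
subadditive = all(a(n + k) <= a(n) + a(k) + 1e-9
                  for n in range(1, 13) for k in range(1, 13))
ratios = [a(n) / n for n in (1, 10, 100, 1000)]
print(subadditive, ratios)
```

In this toy example $a_n = -1.1\,n + \sqrt{n}$, so the ratios $a_n/n$ decrease towards their infimum $-1.1$, in line with the limit statement for subadditive sequences used later in the appendix.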

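Proof of Theorem 3. Since $\nu$ is a functionally invariant lower probability, we have that $\mathcal{M} \subseteq \mathcal{I}$. Define $\{f_n\}_{n\in\mathbb{N}}$ by $f_n = S_n/n$ for all $n\in\mathbb{N}$. It follows that $f_n \in B(\Omega,\mathcal{F})$ for all $n\in\mathbb{N}$. Since $\{S_n\}_{n\in\mathbb{N}}$ satisfies (3.4), it follows that there exists $\lambda\in\mathbb{R}$ such that $-\lambda \le f_n \le \lambda$ for all $n\in\mathbb{N}$. Define $f^* \in B(\Omega,\mathcal{F})$ by $f^* = \limsup_n f_n$ (resp., $f^* = \liminf_n f_n$). By Kingman's Subadditive Ergodic Theorem (see Dudley [T^ Theorem 10.7.1] and [13, Theorem 8.4]) and since $\mathcal{M} \subseteq \mathcal{I}$, we have that $f^* = \lim_n f_n$ holds $P$-almost surely, that is, $P\left(\left\{\omega\in\Omega : \lim_n \frac{S_n(\omega)}{n} = f^*(\omega)\right\}\right) = 1$, for all $P\in\mathcal{M}$. Since $\nu$ is a lower probability, it follows that

$$\nu\left(\left\{\omega\in\Omega : \lim_n \frac{S_n(\omega)}{n} = f^*(\omega)\right\}\right) = 1,$$

proving the main part of the statement.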

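1. If $\nu$ is convex and strongly invariant, then we have that $\operatorname{core}(\nu) \subseteq \mathcal{I}$ and

$$\int_\Omega f\,d\nu = \min_{P\in\operatorname{core}(\nu)}\int_\Omega f\,dP \qquad \forall f\in B(\Omega,\mathcal{F}). \tag{B.6}$$

Consider the sequence $\{a_n\}_{n\in\mathbb{N}}$ defined by $a_n = -\int_\Omega S_n\,d\nu$ for all $n\in\mathbb{N}$. By (B.6) and Lemma 3, we have that $\{a_n\}_{n\in\mathbb{N}}$ is subadditive. It follows that (see [13, Lemma 8.3]) $\lim_n \frac{a_n}{n} = \inf_{n\in\mathbb{N}} \frac{a_n}{n}$, that is,

$$\lim_n \frac{\int_\Omega S_n\,d\nu}{n} = \sup_{n\in\mathbb{N}} \frac{\int_\Omega S_n\,d\nu}{n}. \tag{B.7}$$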

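Recall that $\{f_n\}_{n\in\mathbb{N}}$ is uniformly bounded. By Cerreia-Vioglio, Maccheroni, Marinacci, and Montrucchio [4, Theorem 22], (B.7), and the main part of the statement and since $\operatorname{core}(\nu) \subseteq \mathcal{I}$, we have that

$$\begin{aligned}
\int_\Omega f^*\,d\nu &= \int_\Omega \lim_n f_n\,d\nu = \lim_n \int_\Omega f_n\,d\nu = \lim_n \left[\min_{P\in\operatorname{core}(\nu)}\int_\Omega f_n\,dP\right] \\
&= \lim_n \frac{\int_\Omega S_n\,d\nu}{n} = \lim_n \frac{-a_n}{n} = \sup_{n\in\mathbb{N}} \frac{-a_n}{n} = \sup_{n\in\mathbb{N}} \frac{\int_\Omega S_n\,d\nu}{n} = \sup_{n\in\mathbb{N}} \int_\Omega f_n\,d\nu,
\end{aligned}$$

proving point 1.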

2. If $\nu$ is convex and strongly invariant, then we have that $\operatorname{core}(\nu) \subseteq \mathcal{I}$ and

$$\int_\Omega f\,d\bar{\nu} = \max_{P\in\operatorname{core}(\nu)}\int_\Omega f\,dP \qquad \forall f\in B(\Omega,\mathcal{F}). \tag{B.8}$$

Consider the sequence $\{a_n\}_{n\in\mathbb{N}}$ defined by $a_n = \int_\Omega S_n\,d\bar{\nu}$. By (B.8) and Lemma 3, we have that $\{a_n\}_{n\in\mathbb{N}}$ is subadditive. It follows that (see [13, Lemma 8.3])

$$\lim_n \frac{a_n}{n} = \inf_{n\in\mathbb{N}} \frac{a_n}{n}. \tag{B.9}$$

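Recall that $\{f_n\}_{n\in\mathbb{N}}$ is uniformly bounded. By [4, Theorem 22], (B.9), and the main part of the statement and since $\operatorname{core}(\nu) \subseteq \mathcal{I}$, we have that

$$\begin{aligned}
\int_\Omega f^*\,d\bar{\nu} &= \int_\Omega \lim_n f_n\,d\bar{\nu} = \lim_n \int_\Omega f_n\,d\bar{\nu} = \lim_n \left[\max_{P\in\operatorname{core}(\nu)}\int_\Omega f_n\,dP\right] \\
&= \lim_n \frac{\int_\Omega S_n\,d\bar{\nu}}{n} = \lim_n \frac{a_n}{n} = \inf_{n\in\mathbb{N}} \frac{a_n}{n} = \inf_{n\in\mathbb{N}} \frac{\int_\Omega S_n\,d\bar{\nu}}{n} = \inf_{n\in\mathbb{N}} \int_\Omega f_n\,d\bar{\nu},
\end{aligned}$$

proving point 2.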

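3. By Lemma 2 and since $\nu$ is ergodic, it follows that

$$\nu\left(\left\{\omega\in\Omega : \int_\Omega f^*\,d\nu \le f^*(\omega) \le \int_\Omega f^*\,d\bar{\nu}\right\}\right) = 1.$$

By the initial part of the proof, we have that $\nu\left(\left\{\omega\in\Omega : f^*(\omega) = \lim_n \frac{S_n(\omega)}{n}\right\}\right) = 1$. Since $\nu$ is a lower probability, this implies that

$$\nu\left(\left\{\omega\in\Omega : \int_\Omega f^*\,d\nu \le \lim_n \frac{S_n(\omega)}{n} \le \int_\Omega f^*\,d\bar{\nu}\right\}\right) = 1,$$

proving the statement. ■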
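The identity (B.6), which expresses the Choquet integral with respect to a convex $\nu$ as a minimum over $\operatorname{core}(\nu)$, can be checked by brute force on a finite space. The sketch below is only an illustration under stated assumptions: the four-point space, the capacity $\nu(A) = (|A|/4)^2$ (a convex distortion of the uniform measure), and the test function `f` are hypothetical choices. It uses the standard fact that, for a convex capacity on a finite set, the extreme points of the core are the marginal vectors generated by orderings of the states.

```python
from itertools import permutations

N = 4  # hypothetical state space {0, 1, 2, 3}


def nu(A):
    """Convex capacity: a convex distortion of the uniform measure."""
    return (len(A) / N) ** 2


def choquet(f):
    """Choquet integral of f with respect to nu.

    States are sorted by decreasing value of f; each value is weighted by the
    increase of nu along the resulting chain of upper level sets.
    """
    total, seen = 0.0, set()
    for w in sorted(range(N), key=lambda w: -f[w]):
        before = nu(seen)
        seen.add(w)
        total += f[w] * (nu(seen) - before)
    return total


def marginal_vector(order):
    """Probability P with P({w}) equal to the increase of nu when w is added."""
    P, seen = [0.0] * N, set()
    for w in order:
        before = nu(seen)
        seen.add(w)
        P[w] = nu(seen) - before
    return P


def min_over_core(f):
    """Minimum of the integral of f over the extreme points of core(nu)."""
    return min(sum(P[w] * f[w] for w in range(N))
               for P in map(marginal_vector, permutations(range(N))))


f = [3.0, -1.0, 2.0, 0.5]
print(choquet(f), min_over_core(f))
```

Both numbers agree (here they equal 0.28125), since the minimum is attained at the ordering that lists the states by decreasing value of `f`.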

Proof of Corollary 2. Pick $f \in B(\Omega,\mathcal{F})$. It is immediate to see that $\{S_n\}_{n\in\mathbb{N}}$, defined by $S_n = \sum_{k=1}^{n} f\circ\tau^{k-1}$ for all $n\in\mathbb{N}$, is an additive sequence which satisfies (3.4). Since $\nu$ is convex, continuous at $\Omega$, and strongly invariant, it is a functionally invariant lower probability. Define $\{f_n\}_{n\in\mathbb{N}}$ by $f_n = S_n/n$ for all $n\in\mathbb{N}$. Note that $\int_\Omega f_n\,d\nu = \int_\Omega f\,d\nu$ for all $n\in\mathbb{N}$. By the proof of Theorem 3, we have that $\lim_n \frac{S_n}{n} = \lim_n f_n = f^*$, $\nu$-a.s., proving the main statement and point 1 where $f^\star = f^*$.

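2. Since $\nu$ is convex and strongly invariant, we have that $\operatorname{core}(\nu) \subseteq \mathcal{I}$ and $\int_\Omega f\,d\nu = \min_{P\in\operatorname{core}(\nu)}\int_\Omega f\,dP$. By point 1 and since $\operatorname{core}(\nu) \subseteq \mathcal{I}$, we have that $\int_\Omega f^*\,d\nu = \min_{P\in\operatorname{core}(\nu)}\int_\Omega f^*\,dP = \min_{P\in\operatorname{core}(\nu)}\int_\Omega f\,dP = \int_\Omega f\,d\nu$, proving point 2. Note also that $\int_\Omega f^*\,d\bar{\nu} = \max_{P\in\operatorname{core}(\nu)}\int_\Omega f^*\,dP = \max_{P\in\operatorname{core}(\nu)}\int_\Omega f\,dP = \int_\Omega f\,d\bar{\nu}$.

3. By point 3 of Theorem 3 and the proof of point 2, the statement follows. ■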

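Proof of Lemma IH. Consider a convex capacity $\nu$ and a process $\mathbf{f}$. It is immediate to see that $\nu_{\mathbf{f}}$ is a convex capacity. Next, consider $\{C_n\}_{n\in\mathbb{N}} \subseteq \sigma(\mathcal{C})$ such that $C_n \uparrow \mathbb{R}^{\mathbb{N}}$. It follows that the sequence $\{A_n\}_{n\in\mathbb{N}}$, defined by $A_n = \mathbf{f}^{-1}(C_n)$ for all $n\in\mathbb{N}$, is such that $A_n \uparrow \Omega$. Since $\nu$ is continuous at $\Omega$, we have that $\lim_n \nu_{\mathbf{f}}(C_n) = \lim_n \nu(\mathbf{f}^{-1}(C_n)) = \lim_n \nu(A_n) = 1$, proving that $\nu_{\mathbf{f}}$ is continuous at $\mathbb{R}^{\mathbb{N}}$. Next, consider $C \in \mathcal{C}$. Then, there exist $k\in\mathbb{N}$ and $E \in \mathcal{B}(\mathbb{R}^k)$ such that $C = \{x\in\mathbb{R}^{\mathbb{N}} : (x_1,\dots,x_k) \in E\}$. Note that $\tau^{-1}(C) = \{x\in\mathbb{R}^{\mathbb{N}} : (x_1,x_2,\dots,x_{k+1}) \in \mathbb{R}\times E\}$. Since $\mathbf{f}$ is stationary, it follows that

$$\begin{aligned}
\nu_{\mathbf{f}}(C) = \nu(\mathbf{f}^{-1}(C)) &= \nu(\{\omega\in\Omega : (f_1(\omega),\dots,f_k(\omega)) \in E\}) \\
&= \nu(\{\omega\in\Omega : (f_2(\omega),\dots,f_{k+1}(\omega)) \in E\}) \\
&= \nu(\{\omega\in\Omega : (f_1(\omega),f_2(\omega),\dots,f_{k+1}(\omega)) \in \mathbb{R}\times E\}) \\
&= \nu(\mathbf{f}^{-1}(\tau^{-1}(C))) = \nu_{\mathbf{f}}(\tau^{-1}(C)).
\end{aligned}$$

Since $C\in\mathcal{C}$ was arbitrarily chosen, it follows that $\mathcal{C} \subseteq \{C\in\sigma(\mathcal{C}) : \nu_{\mathbf{f}}(C) = \nu_{\mathbf{f}}(\tau^{-1}(C))\} \subseteq \sigma(\mathcal{C})$. Since $\nu_{\mathbf{f}}$ is convex and continuous at $\mathbb{R}^{\mathbb{N}}$, we have that $\{C\in\sigma(\mathcal{C}) : \nu_{\mathbf{f}}(C) = \nu_{\mathbf{f}}(\tau^{-1}(C))\}$ is a monotone class. By the Monotone Class Theorem (see m Theorem 3.4]), it follows that $\sigma(\mathcal{C}) = \{C\in\sigma(\mathcal{C}) : \nu_{\mathbf{f}}(C) = \nu_{\mathbf{f}}(\tau^{-1}(C))\}$, that is, $\nu_{\mathbf{f}}$ is shift invariant. Define $\mathcal{H} = \bigcap_{k=1}^{\infty} \sigma(\mathcal{C}_k) \subseteq \sigma(\mathcal{C})$. Note that $\mathbf{f}^{-1}(\mathcal{H}) \subseteq \mathcal{T}$. Thus, $\nu_{\mathbf{f}}(\mathcal{H}) = \{0,1\}$ if $\nu(\mathcal{T}) = \{0,1\}$. Let $\mathcal{G}$ be the $\sigma$-algebra of shift invariant events. It is well known that $\mathcal{G} \subseteq \mathcal{H}$. In light of these observations, it is immediate to see that if $\nu(\mathcal{T}) = \{0,1\}$, then $\nu_{\mathbf{f}}(\mathcal{G}) = \{0,1\}$, that is, $\nu_{\mathbf{f}}$ is ergodic. ■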
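The first step of this proof, that the image capacity $C \mapsto \nu(\mathbf{f}^{-1}(C))$ inherits convexity from $\nu$, can be verified exhaustively in a finite toy model. The sketch below is illustrative only: the four-point space, the capacity `nu`, the map `g` (standing in for the process) and its three-point image are hypothetical choices made for the example.

```python
from itertools import combinations

OMEGA = (0, 1, 2, 3)
IMAGE = ("a", "b", "c")
g = {0: "a", 1: "a", 2: "b", 3: "c"}   # hypothetical map from OMEGA to IMAGE


def nu(A):
    """Convex capacity on OMEGA: a convex distortion of the uniform measure."""
    return (len(A) / len(OMEGA)) ** 2


def nu_g(C):
    """Image capacity: nu of the preimage of C under g."""
    return nu({w for w in OMEGA if g[w] in C})


def subsets(X):
    """All subsets of the finite set X, as frozensets."""
    return [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]


def is_convex(cap, X):
    """Check cap(A | B) + cap(A & B) >= cap(A) + cap(B) for all pairs of subsets."""
    return all(cap(A | B) + cap(A & B) >= cap(A) + cap(B) - 1e-12
               for A in subsets(X) for B in subsets(X))


print(is_convex(nu, OMEGA), is_convex(nu_g, IMAGE))
```

The check succeeds for the image capacity because taking preimages commutes with unions and intersections, which is the reason behind the "immediate" claim in the proof.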

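Proof of Theorem 4. By induction and since $\mathbf{f}$ is stationary, it follows that for each $k\in\mathbb{N}$ and for each Borel subset $B$ of $\mathbb{R}$

$$\nu(\{\omega\in\Omega : f_1(\omega)\in B\}) = \nu(\{\omega\in\Omega : f_k(\omega)\in B\}). \tag{B.10}$$

By (B.10), this implies that for each $k\in\mathbb{N}$ and for each Borel subset $B$ of $\mathbb{R}$

$$\nu_{\mathbf{f}}(\{x\in\mathbb{R}^{\mathbb{N}} : x_k\in B\}) = \nu(\{\omega\in\Omega : f_k(\omega)\in B\}) = \nu(\{\omega\in\Omega : f_1(\omega)\in B\}).$$

In particular, since $\{f_n\}_{n\in\mathbb{N}} \subseteq B(\Omega,\mathcal{F})$, it follows that there exists $m\in\mathbb{R}$ such that $-m1_\Omega \le f_1 \le m1_\Omega$. If we replace $B$ with $[-m,m]$, then we can conclude that

$$\nu_{\mathbf{f}}(\{x\in\mathbb{R}^{\mathbb{N}} : x_k\in[-m,m]\}) = \nu(\{\omega\in\Omega : f_1(\omega)\in[-m,m]\}) = 1 \qquad \forall k\in\mathbb{N}. \tag{B.11}$$

Define $\pi : \mathbb{R}^{\mathbb{N}} \to \mathbb{R}$ by

$$\pi(x) = \begin{cases} x_1 & \text{if } x_1\in[-m,m] \\ 0 & \text{otherwise} \end{cases} \qquad \forall x\in\mathbb{R}^{\mathbb{N}}.$$

It is immediate to see that $\pi \in B(\mathbb{R}^{\mathbb{N}},\sigma(\mathcal{C}))$. Note also that

$$\bigcap_{k=1}^{\infty}\{x\in\mathbb{R}^{\mathbb{N}} : x_k\in[-m,m]\} \subseteq \bigcap_{n=1}^{\infty}\left\{x\in\mathbb{R}^{\mathbb{N}} : \frac{1}{n}\sum_{k=1}^{n}\pi(\tau^{k-1}(x)) = \frac{1}{n}\sum_{k=1}^{n}x_k\right\}. \tag{B.12}$$

By (B.11) and (B.12) and since $\nu_{\mathbf{f}}$ is a convex capacity which is further continuous at $\mathbb{R}^{\mathbb{N}}$, it follows that

$$\nu_{\mathbf{f}}\left(\bigcap_{n=1}^{\infty}\left\{x\in\mathbb{R}^{\mathbb{N}} : \frac{1}{n}\sum_{k=1}^{n}\pi(\tau^{k-1}(x)) = \frac{1}{n}\sum_{k=1}^{n}x_k\right\}\right) = 1. \tag{B.13}$$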

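Footnote: $\mathcal{C}_k$ is the class of cylinders $C$ such that $C = \{x\in\mathbb{R}^{\mathbb{N}} : (x_1,\dots,x_{k'}) \in \mathbb{R}^k\times E\}$, where $k' > k$ and $E\in\mathcal{B}(\mathbb{R}^{k'-k})$.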
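By Theorem 2 and since $\nu_{\mathbf{f}}$ is shift invariant and ergodic, we have that there exists $\pi^* \in B(\mathbb{R}^{\mathbb{N}},\mathcal{G})$ such that

$$\nu_{\mathbf{f}}\left(\left\{x\in\mathbb{R}^{\mathbb{N}} : \int_{\mathbb{R}^{\mathbb{N}}}\pi^*\,d\nu_{\mathbf{f}} \le \lim_n \frac{1}{n}\sum_{k=1}^{n}\pi(\tau^{k-1}(x)) = \pi^*(x) \le \int_{\mathbb{R}^{\mathbb{N}}}\pi^*\,d\bar{\nu}_{\mathbf{f}}\right\}\right) = 1. \tag{B.14}$$

By (B.13) and (B.14) and since $\nu_{\mathbf{f}}$ is convex, we can conclude that

$$\nu_{\mathbf{f}}\left(\left\{x\in\mathbb{R}^{\mathbb{N}} : \int_{\mathbb{R}^{\mathbb{N}}}\pi^*\,d\nu_{\mathbf{f}} \le \lim_n \frac{1}{n}\sum_{k=1}^{n}x_k = \pi^*(x) \le \int_{\mathbb{R}^{\mathbb{N}}}\pi^*\,d\bar{\nu}_{\mathbf{f}}\right\}\right) = 1. \tag{B.15}$$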


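Let $E = \left\{x\in\mathbb{R}^{\mathbb{N}} : \lim_n \frac{1}{n}\sum_{k=1}^{n}\pi(\tau^{k-1}(x)) = \pi^*(x)\right\}$ and $\pi_n = \frac{1}{n}\sum_{k=1}^{n}\pi\circ\tau^{k-1}$ for all $n\in\mathbb{N}$. By (B.14), we have that $P(E) = 1$ for all $P\in\operatorname{core}(\nu_{\mathbf{f}})$. By construction, $\{1_E\pi_n\}_{n\in\mathbb{N}} \subseteq B(\mathbb{R}^{\mathbb{N}},\sigma(\mathcal{C}))$ is a uniformly bounded sequence which converges pointwise to $1_E\pi^*$. By [4, Theorem 22] and since $\nu_{\mathbf{f}}$ is convex and $P(E) = 1$ for all $P\in\operatorname{core}(\nu_{\mathbf{f}})$, this implies that

$$\int_{\mathbb{R}^{\mathbb{N}}}\pi^*\,d\nu_{\mathbf{f}} = \int_{\mathbb{R}^{\mathbb{N}}}1_E\pi^*\,d\nu_{\mathbf{f}} = \int_{\mathbb{R}^{\mathbb{N}}}\lim_n 1_E\pi_n\,d\nu_{\mathbf{f}} = \lim_n\int_{\mathbb{R}^{\mathbb{N}}}1_E\pi_n\,d\nu_{\mathbf{f}} = \lim_n\int_{\mathbb{R}^{\mathbb{N}}}\pi_n\,d\nu_{\mathbf{f}}. \tag{B.16}$$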

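Next, since $\nu_{\mathbf{f}}$ is convex and shift invariant, note that for each $n\in\mathbb{N}$

$$\int_{\mathbb{R}^{\mathbb{N}}}\pi_n\,d\nu_{\mathbf{f}} = \int_{\mathbb{R}^{\mathbb{N}}}\frac{1}{n}\sum_{k=1}^{n}\pi\circ\tau^{k-1}\,d\nu_{\mathbf{f}} \ge \frac{1}{n}\sum_{k=1}^{n}\int_{\mathbb{R}^{\mathbb{N}}}\pi\circ\tau^{k-1}\,d\nu_{\mathbf{f}} = \int_{\mathbb{R}^{\mathbb{N}}}\pi\,d\nu_{\mathbf{f}}.$$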

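By (B.16), it follows that $\int_{\mathbb{R}^{\mathbb{N}}}\pi^*\,d\nu_{\mathbf{f}} \ge \int_{\mathbb{R}^{\mathbb{N}}}\pi\,d\nu_{\mathbf{f}}$. A similar argument yields that $\int_{\mathbb{R}^{\mathbb{N}}}\pi^*\,d\bar{\nu}_{\mathbf{f}} \le \int_{\mathbb{R}^{\mathbb{N}}}\pi\,d\bar{\nu}_{\mathbf{f}}$. Finally, since $\int_{\mathbb{R}^{\mathbb{N}}}\pi\,d\nu_{\mathbf{f}} = \int_\Omega f_1\,d\nu$ and $\int_{\mathbb{R}^{\mathbb{N}}}\pi\,d\bar{\nu}_{\mathbf{f}} = \int_\Omega f_1\,d\bar{\nu}$, by (B.15), we can conclude that

$$\begin{aligned}
1 &= \nu_{\mathbf{f}}\left(\left\{x\in\mathbb{R}^{\mathbb{N}} : \int_{\mathbb{R}^{\mathbb{N}}}\pi\,d\nu_{\mathbf{f}} \le \lim_n \frac{1}{n}\sum_{k=1}^{n}x_k \le \int_{\mathbb{R}^{\mathbb{N}}}\pi\,d\bar{\nu}_{\mathbf{f}}\right\}\right) \\
&= \nu\left(\left\{\omega\in\Omega : \int_\Omega f_1\,d\nu \le \lim_n \frac{1}{n}\sum_{k=1}^{n}f_k(\omega) \le \int_\Omega f_1\,d\bar{\nu}\right\}\right),
\end{aligned}$$

proving the statement. ■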
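The inequality $\int \pi_n\,d\nu_{\mathbf{f}} \ge \int \pi\,d\nu_{\mathbf{f}}$ used in this proof combines two properties of the Choquet integral: superadditivity under a convex capacity and invariance under the shift. The sketch below checks the same inequality in a finite toy model. All ingredients are assumptions made for the example: a five-point space, the cyclic permutation `TAU` in the role of the shift, the symmetric convex capacity $\nu(A) = (|A|/5)^2$ (invariant under any permutation), and the function `g` in the role of $\pi$.

```python
N = 5
TAU = [1, 2, 3, 4, 0]   # hypothetical shift: a cyclic permutation of {0,...,4}


def nu(A):
    """Convex capacity depending only on cardinality, hence invariant under TAU."""
    return (len(A) / N) ** 2


def choquet(f):
    """Choquet integral of f with respect to nu (states sorted by decreasing f)."""
    total, seen = 0.0, set()
    for w in sorted(range(N), key=lambda w: -f[w]):
        before = nu(seen)
        seen.add(w)
        total += f[w] * (nu(seen) - before)
    return total


def ergodic_average(g, n):
    """Pointwise average (1/n) * sum_{k=1}^{n} g o TAU^(k-1)."""
    avg = [0.0] * N
    for w in range(N):
        x = w
        for _ in range(n):
            avg[w] += g[x] / n
            x = TAU[x]
    return avg


g = [2.0, -1.0, 0.0, 4.0, 1.0]
lower_integral = choquet(g)
averaged = [choquet(ergodic_average(g, n)) for n in range(1, 11)]
print(lower_integral, averaged)
```

Every entry of `averaged` is at least `lower_integral`, with equality for $n = 1$; the gap for larger $n$ reflects that the lower Choquet integral is superadditive rather than additive.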